Skip to content

Added socket keepalive#10

Open
si-hb wants to merge 6 commits into
dudest:mainfrom
si-hb:main
Open

Added socket keepalive#10
si-hb wants to merge 6 commits into
dudest:mainfrom
si-hb:main

Conversation

@si-hb
Copy link
Copy Markdown

@si-hb si-hb commented Apr 2, 2025

Nice work on these modules, thank you.

We found that updating the DSP file would break the connection permanently until the flows were restarted. Added socket keepalive and timeout handling so the module reconnects on its own if the DSP is updated.

Tested as working in a live environment.

@dudest
Copy link
Copy Markdown
Owner

dudest commented Apr 3, 2025

Thank you for contributing to this project. I'll test, merge, and push update to NPM soonest.

@dudest
Copy link
Copy Markdown
Owner

dudest commented Apr 6, 2025

Hi @si-hb,

I've tested the changes and it looks like reconnect is working but several other issues are introduced.

  • The timeout event handler keeps on getting triggered even though the module seems to be reconnected.
  • Feedback subscription is not done on reconnect.
  • node status is not updated. (nice to have)

I think we can build on your work but I fear publishing it as is will add to the issues list.

6 Apr 10:05:25 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:25.231Z] rx: {"jsonrpc":"2.0","method":"ChangeGroup.Poll","params":{"Id":"group 1","Changes":[]}}
6 Apr 10:05:25 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:25.496Z] rx: {"jsonrpc":"2.0","method":"ChangeGroup.Poll","params":{"Id":"group 1","Changes":[]}}
6 Apr 10:05:25 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:25.746Z] rx: {"jsonrpc":"2.0","method":"ChangeGroup.Poll","params":{"Id":"group 1","Changes":[]}}
6 Apr 10:05:25 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:25.834Z] socket end
6 Apr 10:05:25 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:25.835Z] socket closed
6 Apr 10:05:26 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:26.840Z] Attempting to reconnect (attempt 1)...
6 Apr 10:05:26 - [error] [qsys-core:855ca905e4aba6c3] Error: connect ECONNREFUSED 127.0.0.1:1710
6 Apr 10:05:26 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:26.853Z] socket closed
6 Apr 10:05:28 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:28.860Z] Attempting to reconnect (attempt 2)...
6 Apr 10:05:28 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:28.861Z] Attempting to reconnect (attempt 3)...
6 Apr 10:05:28 - [error] [qsys-core:855ca905e4aba6c3] Error: connect ECONNREFUSED 127.0.0.1:1710
6 Apr 10:05:28 - [error] [qsys-core:855ca905e4aba6c3] Error: connect ECONNREFUSED 127.0.0.1:1710
6 Apr 10:05:28 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:28.862Z] socket closed
6 Apr 10:05:28 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:28.862Z] socket closed
6 Apr 10:05:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:36.869Z] Attempting to reconnect (attempt 4)...
6 Apr 10:05:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:36.871Z] Attempting to reconnect (attempt 5)...
6 Apr 10:05:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:36.874Z] Attempting to reconnect (attempt 6)...
6 Apr 10:05:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:05:36.875Z] Attempting to reconnect (attempt 7)...
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.891Z] socket timeout detected, destroying socket
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.892Z] socket timeout detected, destroying socket
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.892Z] socket timeout detected, destroying socket
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.893Z] socket timeout detected, destroying socket
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.895Z] socket closed
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.895Z] socket closed
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.895Z] socket closed
6 Apr 10:06:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:06.895Z] socket closed
6 Apr 10:06:14 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:14.705Z] tx: {"jsonrpc":"2.0","id":1234,"method":"Control.Set","params":{"Name":"Mixer4x2Input4Mute","Value":false}}
6 Apr 10:06:16 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:16.440Z] tx: {"jsonrpc":"2.0","id":1234,"method":"Control.Set","params":{"Name":"Mixer4x2Input4Mute","Value":true}}
6 Apr 10:06:19 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:19.482Z] tx: {"jsonrpc":"2.0","id":1234,"method":"Control.Set","params":{"Name":"Mixer4x2Input4Mute","Value":true}}
6 Apr 10:06:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:36.438Z] tx: {"jsonrpc":"2.0","id":1234,"method":"Control.Set","params":{"Name":"Mixer4x2Input4Mute","Value":true}}
6 Apr 10:06:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:36.904Z] Attempting to reconnect (attempt 8)...
6 Apr 10:06:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:36.906Z] Attempting to reconnect (attempt 9)...
6 Apr 10:06:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:36.907Z] Attempting to reconnect (attempt 10)...
6 Apr 10:06:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:36.907Z] Attempting to reconnect (attempt 11)...
6 Apr 10:06:38 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:38.807Z] tx: {"jsonrpc":"2.0","id":1234,"method":"Control.Set","params":{"Name":"Mixer4x2Input4Mute","Value":false}}
6 Apr 10:06:42 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:06:42.259Z] tx: {"jsonrpc":"2.0","id":1234,"method":"Control.Set","params":{"Name":"Mixer4x2Input4Mute","Value":true}}
6 Apr 10:07:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:06.912Z] socket timeout detected, destroying socket
6 Apr 10:07:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:06.913Z] socket timeout detected, destroying socket
6 Apr 10:07:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:06.913Z] socket timeout detected, destroying socket
6 Apr 10:07:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:06.914Z] socket closed
6 Apr 10:07:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:06.914Z] socket closed
6 Apr 10:07:06 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:06.914Z] socket closed
6 Apr 10:07:12 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:12.262Z] socket timeout detected, destroying socket
6 Apr 10:07:12 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:12.263Z] socket closed
6 Apr 10:07:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:36.920Z] Attempting to reconnect (attempt 12)...
6 Apr 10:07:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:36.921Z] Attempting to reconnect (attempt 13)...
6 Apr 10:07:36 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:36.922Z] Attempting to reconnect (attempt 14)...
6 Apr 10:07:42 - [info] [qsys-core:855ca905e4aba6c3] [2025-04-06T14:07:42.271Z] Attempting to reconnect (attempt 15)...

Core improvements:
- Fix socket timeout issue by reducing NoOp interval from 30s to 15s
- Cap exponential reconnection backoff at 60 seconds (was ~4.3 minutes)
- Fix child nodes not updating status after reconnection
- Ensure data/connect handlers attached to all sockets (initial and reconnected)
- Add proper cleanup of NoOp intervals before starting new ones

Connection state improvements:
- Change logging from trace() to log() for visibility
- Add state change events (connected/connecting/disconnected/error) emitted to child nodes
- Update child nodes to listen to server events instead of socket events directly
- Fix node status updates on reconnection (now shows green dot when reconnected)

Reconnection logic:
- Exponential backoff: 1s, 2s, 4s, 8s, 16s, 32s, 60s (capped)
- Log suppression after 6 failed attempts to prevent flooding
- Continue reconnecting indefinitely at 60s intervals
- Reset attempts counter on successful connection

Testing additions:
- Add test_mixer.json flow for Mixer4x2 controls testing
- Configure for 192.168.1.191 with logCommunications disabled by default

Bug fixes:
- Fix socket timeout race condition (NoOp vs socket timeout timing)
- Fix multiple NoOp intervals running after reconnection
- Fix data handler not attached to reconnected sockets
- Fix child nodes stuck in error state after reconnection
@zdenekstepanek
Copy link
Copy Markdown

Any news, please? I still have a problem in my cinema where ESPHome display is blind (dark) until I move any node by 1px and re-deploy NodeRed.

@AndyFug
Copy link
Copy Markdown

AndyFug commented May 14, 2026

Hi. have tried to use this PR, but unfortunately it caused my NodeRED runtime to crash:

node-red-1  | 14 May 13:22:00 - [error] [qsys-core:1614f34551a44613] [2026-05-14T12:22:00.047Z] Socket error: Error: connect EHOSTUNREACH 10.1.1.86:1710
node-red-1  | 14 May 13:22:00 - [info] [qsys-core:1614f34551a44613] [2026-05-14T12:22:00.048Z] Connection state changed from connecting to error: connect EHOSTUNREACH 10.1.1.86:1710
node-red-1  | [red] Uncaught Exception:
node-red-1  | 14 May 13:22:00 - [error] Error [ERR_UNHANDLED_ERROR]: Unhandled error. ('connect EHOSTUNREACH 10.1.1.86:1710')
node-red-1  |     at QsysCoreNode.emit (node:events:513:17)
node-red-1  |     at Node.emit (/usr/src/node-red/node_modules/@node-red/runtime/lib/nodes/Node.js:192:20)
node-red-1  |     at updateNodeStatus (/data/node_modules/node-red-contrib-qsys/lib/qsys-core.js:306:26)
node-red-1  |     at Socket.<anonymous> (/data/node_modules/node-red-contrib-qsys/lib/qsys-core.js:71:17)
node-red-1  |     at Socket.emit (node:events:524:28)
node-red-1  |     at emitErrorNT (node:internal/streams/destroy:169:8)
node-red-1  |     at emitErrorCloseNT (node:internal/streams/destroy:128:3)
node-red-1  |     at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

I think it's because of this.emit('error', message); in updateNodeStatus. Link to method here:

https://github.com/si-hb/node-red-contrib-qsys/blob/46d5f6514040f3e8c6767866eb9648053e1b7b0c/lib/qsys-core.js#L286-L313

I don't think there is an upstream handler for this (by design), so it crashes nodeRED. I haven't written a nodeRED module before, but from a brief bit of reading, I think errors should be emitted using the API this.error(...) to log out to debug/console etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants